bears %>%
  count(month) %>%
  ggplot(aes(x = month, y = n)) +
  geom_point() +
  geom_line()

Practice with ggplot2
Where should I put the aes() bit?
If you put it at the “top level” inside ggplot(aes(...)), the mapping applies to all layers. For example, in the code above, geom_point() and geom_line() both inherit x = month and y = n.
In contrast, if you put the aes() mapping inside a single geometry layer, it will only apply to that layer. For example, this will cause an error since the geom_line() part doesn’t have an aesthetic mapping:
bears %>%
  count(month) %>%
  ggplot() +
  geom_point(aes(x = month, y = n)) +
  geom_line()
#> Error in `geom_line()`:
#> ! Problem while setting up geom.
#> ℹ Error occurred in the 2nd layer.
#> Caused by error in `compute_geom_1()`:
#> ! `geom_line()` requires the following missing aesthetics: x and y
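One way to avoid this error while keeping the mappings inside the layers is to give every layer its own aes(). The sketch below uses the built-in mpg data rather than bears so that it runs on its own; the name p_layers is only for illustration.

```r
library(dplyr)
library(ggplot2)

# Each layer carries its own mapping, so geom_line() has the x and y it needs
p_layers <- mpg %>%
  count(cyl) %>%
  ggplot() +
  geom_point(aes(x = cyl, y = n)) +
  geom_line(aes(x = cyl, y = n))

p_layers
```

Repeating the mapping like this is mostly useful when the layers need different mappings. When they all share the same one, putting aes() at the top level is shorter.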
Main geoms
geom_point()
Basic scatterplot:
mpg %>%
  ggplot() +
  geom_point(aes(x = displ, y = hwy))

Change the color of all points by setting color outside aes():
mpg %>%
  ggplot() +
  geom_point(aes(x = displ, y = hwy), color = 'blue')

To change color based on a variable, map the variable to color in aes():
mpg %>%
  ggplot() +
  geom_point(aes(x = displ, y = hwy, color = class))

Map the shape instead of color (usually not a great idea):
mpg %>%
  ggplot() +
  geom_point(aes(x = displ, y = hwy, shape = class))

What happened to SUV?
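A likely explanation: the default shape palette in ggplot2 only supplies six shapes, and class has seven levels, so the last level (suv) gets no shape and its points are dropped with a warning. If you really do want shapes, one option is to supply them yourself with scale_shape_manual(). In this sketch the shape codes 0 to 6 are an arbitrary choice and p_shapes is an illustrative name.

```r
library(dplyr)
library(ggplot2)

# Assumption: shape codes 0 to 6 are an arbitrary pick, one per level of class
p_shapes <- mpg %>%
  ggplot() +
  geom_point(aes(x = displ, y = hwy, shape = class)) +
  scale_shape_manual(values = 0:6)

p_shapes
```

Even with all seven shapes drawn, they are hard to tell apart, which is part of why mapping shape is usually not a great idea.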
geom_line() vs. geom_smooth()
geom_line() connects all the dots:
mpg %>%
  ggplot() +
  geom_line(aes(x = displ, y = hwy))

This looks messy because geom_line() literally connects every point in order from left to right.
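If you still want a line here, one option is to summarise the data first so there is only one y value for each x. This is a minimal sketch, assuming the mean is a sensible summary; mean_hwy and p_mean_line are names introduced just for this example.

```r
library(dplyr)
library(ggplot2)

# Collapse to one row per displ so the line has a single y value at each x
p_mean_line <- mpg %>%
  group_by(displ) %>%
  summarise(mean_hwy = mean(hwy)) %>%
  ggplot() +
  geom_line(aes(x = displ, y = mean_hwy))

p_mean_line
```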
If you want a single “best-fit” trend line, use geom_smooth():
mpg %>%
  ggplot() +
  geom_smooth(aes(x = displ, y = hwy))

Set se = FALSE to drop the shaded confidence band:
mpg %>%
  ggplot() +
  geom_smooth(aes(x = displ, y = hwy), se = FALSE)

geom_col()
For these examples, I’m creating a smaller summary data frame first that just counts how many rows there are for each class:
mpg %>%
  count(class)
#> # A tibble: 7 × 2
#>   class          n
#>   <chr>      <int>
#> 1 2seater        5
#> 2 compact       47
#> 3 midsize       41
#> 4 minivan       11
#> 5 pickup        33
#> 6 subcompact    35
#> 7 suv           62
Basic bar plot of the counts:
mpg %>%
  count(class) %>%
  ggplot() +
  geom_col(aes(x = class, y = n), width = 0.7) # width sets the bar width

Re-order bars based on count using reorder():
mpg %>%
  count(class) %>%
  ggplot() +
  geom_col(aes(x = reorder(class, n), y = n), width = 0.7)

To change the color for all bars, use fill (not color):
mpg %>%
  count(class) %>%
  ggplot() +
  geom_col(aes(x = reorder(class, n), y = n), fill = 'blue', width = 0.7)

To change color based on a variable, map the variable to fill in aes():
mpg %>%
  count(class, drv) %>% # Note I had to include drv in the count too
  ggplot() +
  geom_col(aes(x = reorder(class, n), y = n, fill = drv), width = 0.7)

Use position = 'dodge' to change from stacked to side-by-side:
mpg %>%
  count(class, drv) %>% # Note I had to include drv in the count too
  ggplot() +
  geom_col(
    aes(x = reorder(class, n), y = n, fill = drv),
    position = "dodge", width = 0.7)

Practice
Facets
Facets make multiple small charts and are useful when you have many levels in a categorical variable.
For example, this plot has too many color categories for the color to be useful:
mpg %>%
  ggplot(aes(x = displ, y = hwy)) +
  geom_point(aes(color = class))

Instead, we can use facet_wrap() to show a separate chart for each vehicle class:
mpg %>%
  ggplot(aes(x = displ, y = hwy)) +
  geom_point() +
  facet_wrap(~class)

You can also use facet_grid() to facet by two variables:
mpg %>%
  ggplot(aes(x = displ, y = hwy)) +
  geom_point() +
  facet_grid(drv ~ cyl)

Extra Practice
bears %>%
  count(year, gender)
#> # A tibble: 102 × 3
#>     year gender     n
#>    <dbl> <chr>  <int>
#>  1  1901 female     1
#>  2  1901 male       2
#>  3  1906 male       1
#>  4  1908 <NA>       1
#>  5  1916 male       1
#>  6  1922 male       1
#>  7  1929 female     1
#>  8  1929 male       2
#>  9  1930 male       1
#> 10  1932 male       3
#> # ℹ 92 more rows
mpg %>%
  mutate(manufacturer = str_to_title(manufacturer)) %>%
  group_by(manufacturer) %>%
  summarise(mean_hwy = mean(hwy))
#> # A tibble: 15 × 2
#>    manufacturer mean_hwy
#>    <chr>           <dbl>
#>  1 Audi             26.4
#>  2 Chevrolet        21.9
#>  3 Dodge            17.9
#>  4 Ford             19.4
#>  5 Honda            32.6
#>  6 Hyundai          26.9
#>  7 Jeep             17.6
#>  8 Land Rover       16.5
#>  9 Lincoln          17
#> 10 Mercury          18
#> 11 Nissan           24.6
#> 12 Pontiac          26.4
#> 13 Subaru           25.6
#> 14 Toyota           24.9
#> 15 Volkswagen       29.2
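One possible way to chart this summary, reusing the geom_col() and reorder() pattern from earlier. This is only a sketch and not necessarily the intended answer; coord_flip() is an addition that is not covered above and is used here to keep the manufacturer names readable, and p_hwy is an illustrative name.

```r
library(dplyr)
library(ggplot2)
library(stringr)

# Same summary as above, piped straight into a sorted bar chart
p_hwy <- mpg %>%
  mutate(manufacturer = str_to_title(manufacturer)) %>%
  group_by(manufacturer) %>%
  summarise(mean_hwy = mean(hwy)) %>%
  ggplot() +
  geom_col(aes(x = reorder(manufacturer, mean_hwy), y = mean_hwy), width = 0.7) +
  coord_flip() # assumption: horizontal bars, so the long names do not overlap

p_hwy
```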